Dynamic Load Balancing for Adaptive Scientific Computations via Hypergraph Repartitioning
Authors
Abstract
Adaptive scientific computations require that periodic repartitioning (load balancing) occur dynamically to maintain load balance. Hypergraph partitioning is a successful model for minimizing communication volume in scientific computations, and partitioning software for the static case is widely available. In this paper, we present a new hypergraph model for the dynamic case, where we minimize the sum of communication in the application plus the migration cost to move data, thereby reducing total execution time. The new model can be solved using hypergraph partitioning with fixed vertices. We describe an implementation of a parallel multilevel partitioning algorithm within the Zoltan load-balancing toolkit, which to our knowledge is the first code for dynamic load balancing based on hypergraph partitioning. Finally, we present experimental results that demonstrate the effectiveness of our approach on a Linux cluster with up to 64 processors. Our new algorithm compares favorably to the widely used ParMETIS partitioning software in terms of quality.
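The objective described above, communication cost plus migration cost, can be illustrated with a small sketch. It is not the Zoltan implementation; it shows one common way to express such a combined objective as a single hypergraph cut by using fixed vertices. The construction is an assumption for illustration: each part gets one extra vertex fixed to that part, each data vertex gets a "migration net" linking it to the fixed vertex of its current part (weighted by its data size), and the original communication nets are scaled by a hypothetical factor `alpha` that weights communication against migration. All function and variable names here are illustrative.

```python
from typing import Dict, List, Tuple

Net = Tuple[List[str], float]  # (pins, weight)


def connectivity_cost(nets: List[Net], part: Dict[str, int]) -> float:
    """Sum over nets of weight * (number of distinct parts touched - 1)."""
    total = 0.0
    for pins, weight in nets:
        touched = {part[v] for v in pins}
        total += weight * (len(touched) - 1)
    return total


def augment(nets: List[Net], old_part: Dict[str, int],
            sizes: Dict[str, float], k: int, alpha: float):
    """Build the augmented hypergraph (an assumed construction).

    Returns the scaled communication nets plus one migration net per
    data vertex, and the assignment of the k fixed part vertices.
    """
    scaled = [(pins, alpha * w) for pins, w in nets]
    migration = [([v, f"part{old_part[v]}"], sizes[v]) for v in old_part]
    fixed = {f"part{p}": p for p in range(k)}
    return scaled + migration, fixed


def repartition_cost(nets, old_part, new_part, sizes, k, alpha) -> float:
    """Cut cost of the augmented hypergraph under a candidate new partition.

    Equals alpha * (communication volume) + (size of migrated data).
    """
    aug_nets, fixed = augment(nets, old_part, sizes, k, alpha)
    assignment = {**new_part, **fixed}  # fixed vertices never move
    return connectivity_cost(aug_nets, assignment)


# Toy instance: four data vertices, two communication nets, two parts.
nets = [(["a", "b"], 1.0), (["b", "c", "d"], 1.0)]
old_part = {"a": 0, "b": 0, "c": 1, "d": 1}
sizes = {"a": 2.0, "b": 3.0, "c": 1.0, "d": 4.0}

stay = dict(old_part)            # keep the current distribution
move_b = {**old_part, "b": 1}    # migrate vertex b to part 1

print(repartition_cost(nets, old_part, stay, sizes, k=2, alpha=10.0))    # 10.0
print(repartition_cost(nets, old_part, move_b, sizes, k=2, alpha=10.0))  # 13.0
```

In this toy instance, moving `b` leaves the communication volume unchanged (one cut net either way) but adds 3.0 units of migration, so the combined objective prefers leaving the data in place. A partitioner that supports fixed vertices could minimize this cut directly, since a migration net is cut exactly when its vertex leaves its current part.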
Similar resources
A repartitioning hypergraph model for dynamic load balancing
In parallel adaptive applications, the computational structure of the applications changes over time, leading to load imbalances even though the initial load distributions were balanced. To restore balance and to keep communication volume low in further iterations of the applications, dynamic load balancing (repartitioning) of the changed computational structure is required. Repartitioning diff...
Hypergraph-based Dynamic Partitioning and Load Balancing
1.1 INTRODUCTION An important component of parallel scientific computing is the assignment of work to processors. This assignment problem is also known as partitioning or mapping. The goal of the assignment problem is to find a task-to-processor mapping that will minimize the total execution time. Although efficient optimal solutions for certain restricted variations, such as chain- or tree-stru...
Graph Repartitioning with both Dynamic Load and Dynamic Processor Allocation
Dynamic load balancing is an important step that conditions the performance of parallel programs such as adaptive mesh refinement codes. If the global workload varies drastically over time (such that memory is exceeded), it can be useful to adjust the number of processors while keeping the load balanced. We propose two different solutions that extend classic graph repartitioning approaches to...
PLUM: Parallel Load Balancing for Adaptive Unstructured Meshes
Mesh adaption is a powerful tool for efficient unstructured-grid computations but causes load imbalance among processors on a parallel machine. We present a novel method called PLUM to dynamically balance the processor workloads with a global view. This paper describes the implementation and integration of all major components within our dynamic load balancing strategy for adaptive grid calculati...
Experiments with Repartitioning and Load Balancing Adaptive Meshes
Mesh adaption is a powerful tool for efficient unstructured-grid computations but causes load imbalance on multiprocessor systems. To address this problem, we have developed PLUM, an automatic portable framework for performing adaptive large-scale numerical computations in a message-passing environment. This paper presents several experimental results that verify the effectiveness of PLUM on se...